Goto

Collaborating Authors

 gaussian splatting


Reframing Gaussian Splatting Densification with Complexity-Density Consistency of Primitives

Neural Information Processing Systems

The essence of 3DGaussian Splatting (3DGS) training is to smartly allocate Gaussian primitives, expressing complex regions with more primitives and vice versa. Prior researches typically mark out under-reconstructed regions in a renderingloss-driven manner. However, such a loss-driven strategy is often dominated by low-frequency regions, which leads to insufficient modeling of high-frequency details in texture-rich regions. As a result, it yields a suboptimal spatial allocation of Gaussian primitives. This inspires us to excavate the loss-agnostic visual prior in training views to identify complex regions that need more primitives to model.


TreeSplat: Mergeable Tree for Deformable Gaussian Splatting

Neural Information Processing Systems

Dynamic 3D scene reconstruction from multi-view videos demands representation to model complex deformations at scale. Current Gaussian Splatting based methods often either suffer from significant computation cost due to dense MLP-based modeling or explicit modeling deformation of each Gaussian independently. However, the dynamics of objects within a scene are typically hierarchical and exhibit structural correlations. To leverage these structural priors into the representation, we introduce TreeSplat, a Tree data structure for deformable Gaussian Splatting. In TreeSplat, as the name suggests, motions of Gaussian are represented hierarchically within a tree.


LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS

Neural Information Processing Systems

In this paper, we introduce LangSplatV2, which achieves high-dimensional feature splatting at 476.2 FPS and 3D open-vocabulary text querying at 384.6 FPS for high-resolution images, providing a 42 speedup and a 47 boost over LangSplat respectively, along with improved query accuracy. LangSplat employs Gaussian Splatting to embed 2D CLIP language features into 3D, significantly enhancing speed and learning a precise 3D language field with SAM semantics. Such advancements in 3D language fields are crucial for applications that require language interaction within complex scenes. However, LangSplat does not yet achieve real-time performance (8.2 FPS), even with advanced A100 GPUs, severely limiting its broader application. In this paper, we first conduct a detailed time analysis of LangSplat, identifying the heavyweight decoder as the primary speed bottleneck. Our solution, LangSplatV2 assumes that each Gaussian acts as a sparse code within a global dictionary, leading to the learning of a 3D sparse coefficient field that entirely eliminates the need for a heavyweight decoder. By leveraging this sparsity, we further propose an efficient sparse coefficient splatting method with CUDA optimization, rendering high-dimensional feature maps at high quality while incurring only the time cost of splatting an ultra-low-dimensional feature. Our experimental results demonstrate that LangSplatV2 not only achieves better or competitive query accuracy but is also significantly faster. Codes and demos are available at our project page: https://langsplat-v2.github.io.


Orientation-anchored Hyper-Gaussian for 4D Reconstruction from Casual Videos

Neural Information Processing Systems

We present Orientation-anchored Gaussian Splatting (OriGS), a novel framework for high-quality 4D reconstruction from casually captured monocular videos. While recent advances extend 3D Gaussian Splatting to dynamic scenes via various motion anchors, such as graph nodes or spline control points, they often rely on low-rank assumptions and fall short in modeling complex, region-specific deformations inherent to unconstrained dynamics. OriGS addresses this by introducing a hyperdimensional representation grounded in scene orientation.


Temporal Smoothness-Aware Rate-Distortion Optimized 4D Gaussian Splatting

Neural Information Processing Systems

Dynamic 4D Gaussian Splatting (4DGS) effectively extends the high-speed rendering capabilities of 3D Gaussian Splatting (3DGS) to represent volumetric videos.


Reframing Gaussian Splatting Densification with Complexity-Density Consistency of Primitives

Neural Information Processing Systems

The essence of 3D Gaussian Splatting (3DGS) training is to smartly allocate Gaussian primitives, expressing complex regions with more primitives and vice versa. Prior researches typically mark out under-reconstructed regions in a rendering-loss-driven manner. However, such a loss-driven strategy is often dominated by low-frequency regions, which leads to insufficient modeling of high-frequency details in texture-rich regions. As a result, it yields a suboptimal spatial allocation of Gaussian primitives. This inspires us to excavate the loss-agnostic visual prior in training views to identify complex regions that need more primitives to model. Based on this insight, we propose Complexity-Density Consistent Gaussian Splatting (CDC-GS), which allocates primitives based on the consistency between visual complexity of training views and the density of primitives.


MaterialRefGS: Reflective Gaussian Splatting with Multi-view Consistent Material Inference

Neural Information Processing Systems

Modeling reflections from 2D images is essential for photorealistic rendering and novel view synthesis. Recent approaches enhance Gaussian primitives with reflection-related material attributes to enable physically based rendering (PBR) with Gaussian Splatting. However, the material inference often lacks sufficient constraints, especially under limited environment modeling, resulting in illumination aliasing and reduced generalization. In this work, we revisit the problem from a multi-view perspective and show that multi-view consistent material inference with more physically-based environment modeling is key to learning accurate reflections with Gaussian Splatting. To this end, we enforce 2D Gaussians to produce multi-view consistent material maps during deferred shading. We also track photometric variations across views to identify highly reflective regions, which serve as strong priors for reflection strength terms. To handle indirect illumination caused by inter-object occlusions, we further introduce an environment modeling strategy through ray tracing with 2DGS, enabling photorealistic rendering of indirect radiance. Experiments on widely used benchmarks show that our method faithfully recovers both illumination and geometry, achieving state-of-the-art rendering quality in novel views synthesis.


1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering

Neural Information Processing Systems

Despite achieving superior quality, 4DGS typically requires substantial storage and suffers from slow rendering speed. In this work, we delve into these issues and identify two key sources of temporal redundancy.


Dynamic Gaussian Splatting from Defocused and Motion-blurred Monocular Videos

Neural Information Processing Systems

This paper presents a unified framework that allows high-quality dynamic Gaussian Splatting from both defocused and motion-blurred monocular videos. Due to the significant difference between the formation processes of defocus blur and motion blur, existing methods are tailored for either one of them, lacking the ability to simultaneously deal with both of them. Although the two can be jointly modeled as blur kernel-based convolution, the inherent difficulty in estimating accurate blur kernels greatly limits the progress in this direction. In this work, we go a step further towards this direction. Particularly, we propose to estimate per-pixel reliable blur kernels using a blur prediction network that exploits blur-related scene and camera information and is subject to a blur-aware sparsity constraint. Besides, we introduce a dynamic Gaussian densification strategy to mitigate the lack of Gaussians for incomplete regions, and boost the performance of novel view synthesis by incorporating unseen view information to constrain scene optimization. Extensive experiments show that our method outperforms the state-of-the-art methods in generating photorealistic novel view synthesis from defocused and motion-blurred monocular videos.


GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation

Neural Information Processing Systems

We introduce GS2E (Gaussian Splatting to Event Generation), a large-scale synthetic event dataset designed for high-fidelity event vision tasks, captured from real-world sparse multi-view RGB images. Existing event datasets are often synthesized from dense RGB videos, which typically suffer from limited viewpoint diversity and geometric inconsistency, or rely on expensive, hard-to-scale hardware setups. GS2E addresses these limitations by first reconstructing photorealistic static scenes using 3D Gaussian Splatting, followed by a novel, physically-informed event simulation pipeline.